-
- FastVid 1.03. Copyright 1996 by John Hinkley. 72466.1403@compuserve.com
-
- --------------------------------------------------------------------------
- WARNING WARNING WARNING WARNING WARNING WARNING WARNING WARNING WARNING
- --------------------------------------------------------------------------
-
- THIS PROGRAM WILL ONLY WORK ON PENTIUM PRO PROCESSORS. IT WILL NOT WORK,
- AND IS NOT NEEDED, ON PENTIUM AND EARLIER CPUs.
-
- According to Intel, enabling Write Posting (see below) on 82450 steppings
- before B0 could result in "rare" problems on the PCI bus. If the program
- indicates that Write Posting is enabled when you first run this program
- (the "Before" message) then you have a B0 or later stepping of the 82450
- and don't need to worry about the A2 bugs. The problem will manifest
- itself when there are high levels of traffic on the PCI bus -- multiple
- devices reading and writing at the same time. A typical example where you
- might have problems is when playing multimedia files like AVI and MPEG
- animations. The write combining options of this program can be used
- without problems on any version of the 82450.
-
- If you have a pre-B0 motherboard you may want to play around with write
- posting to see the difference it makes but you shouldn't enable it all the
- time -- it WILL occasionally lock up your computer. If you really want or
- need write posting you should consider getting a new motherboard.
-
- Be forewarned: YOU USE THIS PROGRAM AT YOUR OWN RISK.
-
- --------------------------------------------------------------------------
- End of Warning.
- --------------------------------------------------------------------------
-
- This program enables Write Posting, banked VGA Write Combining and SVGA
- linear frame buffer Write Combining on Pentium Pro motherboards based on
- the 82450 and 82440 chipsets. This will significantly improve graphic
- performance from DOS and Win95.
-
- The program must execute privileged instructions so it must be run in real
- mode. For the time being that means it must be run from DOS. You cannot
- run it from a DOS window or a full screen DOS session from Windows 3.x,
- Win95, WinNT or OS/2. For DOS, Windows 3.x and Win95 you can include the
- program in your AUTOEXEC.BAT file (keep in mind that DOS4GW.EXE must be in
- your search path). If you try to run the program from a protected mode OS
- you will get a DOS4GW error message and register dump.
-
- --------------------------------------------------------------------------
-
- Steppings of the 82450 chipset before B0 have bugs which forced Intel to
- disable Write Posting -- in essence cache writes have been disabled for
- the PCI bus. The B0 stepping has been fixed and Write Posting is enabled
- by default by the BIOS. The difference is easily visible in writing to
- video memory. An A2 motherboard can only write about 8MB/sec to the
- graphics card, a B0 motherboard gets about 18MB/sec. The 82440 chipset
- does not have this problem and virtually all 82440 BIOSes enable write
- posting by default.
-
- But this is not the entire story. With the Pentium Pro, Intel also
- decided that the enabling of Write Combining (the combining of several
- writes into a cache line that can be bursted out the PCI bus) should be
- the responsibility of the O/S, not the BIOS. By enabling Write Combining
- the throughput to video RAM can be further increased to 88MB/sec or more.
-
- There are two mechanisms for which Write Combining needs to be enabled:
- the banked VGA mechanism (the 128KB from A0000 to BFFFF) and the unbanked,
- linear frame buffer that all of the newer graphic cards support. I will
- refer henceforth to linear frame buffer write combining as LFBWC and
- banked VGA write combining as BVWC.
-
- Most low resolution DOS graphics applications and games use the banked
- mechanism. Since the VESA committee has defined a standard, and UNIVBE
- and a few of the graphic card manufacturers have provided the VESA 2.0
- services, some of the latest games use the linear frame buffer (Duke
- Nuke'm 3D and Quake, for example). The linear frame buffer usually
- gives better performance since it removes the requirement to switch
- banks in hires modes.
-
- I have only personally tested this program with 2MB and 4MB Matrox MGA
- Millennium cards. It has been run by others with other cards (S3 964
- based, S3 968 based, Tseng 6000 based) and most benefit to some extent.
- The Number Nine Imagine 128 card does not seem to benefit under most
- circumstances and turns in dismal DOS video scores. Some users have
- reported that starting Win95 after running VSPEED increases the LFB
- performance to the same range as most other high performance cards.
-
- With a 2MB Millennium there were problems with BVWC -- most hires VESA
- modes would result in vertical stripes over the entire screen. This
- appears to be either a hardware or software bug on the part of Matrox. I
- found a workaround which eliminates the stripes but reduces BVGA
- performance a bit. The LFB will still run at full speed. Using a
- negative value for the number of megabytes (for example "FASTVID x11 -2")
- will enable this workaround.
-
- I haven't tested FASTVID on any 8MB graphic cards but I think it will work
- properly. Please let me know if you find otherwise.
-
- On many graphic cards, enabling the BVWC results in problems with some
- programs that use VGA mode 0x12 (640x480x16colors). This appears to be
- either a hardware or software problem on the part of manufacturers. The
- problem is only apparent with BVWC so you can run with that disabled if
- necessary. For example use "FASTVID x01". Note that this is not the same
- as the "vertical stripe" problem mentioned above.
-
- Unfortunately, I have found that EMM386 (and other memory managers like
- QEMM and 386MAX) interferes in some way with LFBWC. When running DOS you
- must remove EMM386 from your CONFIG.SYS file for LFBWC to work (BVWC is
- not affected). If EMM386 is loaded you will see no increase in speed of
- the linear frame buffer. In most cases, if you use a memory manager,
- running Win95 will re-enable the LFBWC.
-
- LFB write combining requires that FASTVID know where the linear frame
- buffer is located. Different graphic card manufacturers put it at
- different addresses. The LFBWC code in FASTVID currently queries any
- installed VESA BIOS Extension driver for the LFB address so you should
- install your VESA driver before FASTVID. If you don't have a VESA driver
- loaded (keep in mind that many cards have the driver in BIOS so you don't
- need to explicitly load one) or your VESA driver doesn't support the LFB,
- FASTVID will examine the PCI configuration registers for the LFB address.
- This isn't always successful -- there are several registers and different
- cards use different ones. If none of these methods works you will have to
- supply an LFB address. Theoretically this program will work for any LFB
- address but I have only personally tested and verified that it works for
- the Matrox MGA Millennium at 0xFF000000. Others have successfully used it
- with other graphics cards at other addresses. If you supply an incorrect
- LFB address you will not see any increase in speed of the LFBWC.
-
- If FASTVID can't automatically detect the LFB address, you can determine
- its location from Win95. Select Start, Settings, Control Panel, (or My
- Computer, Control Panel), System, Device Manager, Display Adaptors, your
- graphics card, Resources. Scroll to the bottom of the Resource Settings
- box and you will see a line (or a few lines) that reads: "Memory Range
- XXXXXXXX - YYYYYYYY". The first value is the location of the linear frame
- buffer. For the Matrox MGA Millennium it is 0xFF000000. If you have
- another address take note of it and input it into FASTVID when asked. If
- there are several addresses try each of them. Ignore the following memory
- regions: A0000-AFFFF, B0000-BFFFF, C0000-CFFFF.
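-
- As a rough illustration of the filtering step above, the following Python
- sketch pulls the start address out of each "Memory Range" line and skips
- the three legacy regions. The sample text is hypothetical (only the
- FF000000 start address comes from this document), and the function name is
- made up for the example; it is not part of FASTVID.

```python
import re

# Legacy VGA/BIOS windows that should be ignored when looking for the LFB.
LEGACY_RANGES = {(0xA0000, 0xAFFFF), (0xB0000, 0xBFFFF), (0xC0000, 0xCFFFF)}

# Matches lines of the form "Memory Range XXXXXXXX - YYYYYYYY".
RANGE_RE = re.compile(r"Memory Range\s+([0-9A-Fa-f]+)\s*-\s*([0-9A-Fa-f]+)")


def candidate_lfb_addresses(resource_text):
    """Return the start address of every non-legacy memory range, in order."""
    candidates = []
    for match in RANGE_RE.finditer(resource_text):
        start, end = (int(group, 16) for group in match.groups())
        if (start, end) in LEGACY_RANGES:
            continue
        candidates.append(start)
    return candidates


# Hypothetical Device Manager listing, typed in by hand.
sample = """\
Memory Range 000A0000 - 000AFFFF
Memory Range 000B0000 - 000BFFFF
Memory Range 000C0000 - 000CFFFF
Memory Range FF000000 - FF7FFFFF
Memory Range FFC00000 - FFC03FFF
"""

for address in candidate_lfb_addresses(sample):
    print(f"{address:08X}")
```

- Each address printed is one to try as the ADDRESS argument; the first one
- is usually the frame buffer.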
-
- --------------------------------------------------------------------------
-
- Usage: FASTVID XYZ N ADDRESS
-
-   X        controls Write Posting.
-   Y        controls VGA (banked) Write Combining.
-   Z        controls SVGA (linear frame buffer) Write Combining.
-            For all three, 0 disables, 1 enables, any other value
-            results in no change from the current setting.
-   N        indicates the amount of video memory in MegaBytes.
-            Valid values are 2, 4, and 8. Also valid are -2, -4,
-            and -8 to apply the special "vertical stripe" patch.
-   ADDRESS  is the address of the linear frame buffer in hex.
-            The Matrox MGA Millennium has it at FF000000.
-
- Example 1: FASTVID
-
- If no arguments are supplied you run through a question and answer
- dialogue and the program sets up the environment. It will also
- tell you what the equivalent command line is for the options you
- chose.
-
- Example 2: FASTVID 111 4
-
- Write posting is enabled (82450 only).
- VGA Write Combining is enabled.
- SVGA Write Combining is enabled, FASTVID will attempt to locate
- the LFB on its own.
-
- Example 3: FASTVID x01 4 FF000000
-
- The write posting setting is not changed by FASTVID.
- VGA Write Combining is disabled.
- SVGA Write Combining is enabled for 4MB video memory at FF000000.
-
- Example 4: FASTVID 111 -2 FF000000
-
- Write posting is enabled (82450 only).
- VGA Write Combining is enabled. "Vertical stripe" patch applied.
- SVGA Write Combining is enabled for 2MB video memory at FF000000.
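-
- To make the argument rules concrete, here is a small Python sketch that
- interprets a FASTVID command line the way the usage text describes it.
- This is only a reading of the documentation above, not FASTVID's actual
- code; the class and function names are invented for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FastvidOptions:
    write_posting: Optional[bool]  # X: True/False, or None for "no change"
    banked_wc: Optional[bool]      # Y: banked VGA write combining
    linear_wc: Optional[bool]      # Z: linear frame buffer write combining
    video_mb: Optional[int]        # N: 2, 4 or 8 (None if omitted)
    stripe_patch: bool             # True when N was given as a negative value
    lfb_address: Optional[int]     # ADDRESS, or None to let FASTVID look for it


def _flag(ch: str) -> Optional[bool]:
    # 0 disables, 1 enables, any other character leaves the setting alone.
    return {"0": False, "1": True}.get(ch)


def parse_fastvid_args(args: List[str]) -> FastvidOptions:
    if not args or len(args[0]) != 3:
        raise ValueError("expected a three-character XYZ argument")
    x, y, z = (_flag(ch) for ch in args[0])
    video_mb, stripe_patch = None, False
    if len(args) > 1:
        n = int(args[1])
        if abs(n) not in (2, 4, 8):
            raise ValueError("N must be 2, 4, 8, -2, -4 or -8")
        video_mb, stripe_patch = abs(n), n < 0
    lfb_address = int(args[2], 16) if len(args) > 2 else None
    return FastvidOptions(x, y, z, video_mb, stripe_patch, lfb_address)


print(parse_fastvid_args(["x01", "4", "FF000000"]))  # Example 3
print(parse_fastvid_args(["111", "-2"]))
```

- A value of None in the first three fields corresponds to "no change from
- the current setting".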
-
- --------------------------------------------------------------------------
-
- Included is a test program called VSPEED.EXE that reports the video
- throughput for bit blit operations from DRAM to VRAM for both the banked
- VGA and linear frame buffer mechanisms.
-
- If you experience difficulties with VSPEED (the program locks up or
- crashes) try using -l or -L on the command line to eliminate the linear
- frame buffer test. For example:
-
- VSPEED -l
-
- will test only the banked VGA mechanism.
-
- VSPEED
-
- will test both the banked VGA and the linear frame buffer (assuming the
- card and VESA driver support it).
-
- --------------------------------------------------------------------------
-
- Sample VSPEED results from an Intel Aurora motherboard with the B0
- stepping of the 82450 and a 4MB Matrox MGA Millennium:
-
- FASTVID 000
- Copy DRAM to banked VGA: 8.07 million bytes per second
- Copy DRAM to linear framebuffer: 8.14 million bytes per second
-
- FASTVID 100
- Copy DRAM to banked VGA: 18.72 million bytes per second
- Copy DRAM to linear framebuffer: 18.91 million bytes per second
-
- FASTVID 011
- Copy DRAM to banked VGA: 37.95 million bytes per second
- Copy DRAM to linear framebuffer: 39.60 million bytes per second
-
- FASTVID 111
- Copy DRAM to banked VGA: 87.72 million bytes per second
- Copy DRAM to linear framebuffer: 93.46 million bytes per second
-
- FASTVID 111 -2
- Copy DRAM to banked VGA: 49.20 million bytes per second
- Copy DRAM to linear framebuffer: 93.46 million bytes per second
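-
- The relative gains are easier to see as ratios against the "000" baseline.
- The Python sketch below simply divides the figures listed above; the
- function name is made up for the example.

```python
# VSPEED figures from the list above, in million bytes per second:
# setting -> (banked VGA, linear frame buffer)
RESULTS = {
    "000": (8.07, 8.14),
    "100": (18.72, 18.91),
    "011": (37.95, 39.60),
    "111": (87.72, 93.46),
    "111 -2": (49.20, 93.46),
}


def speedups(results, baseline="000"):
    """Return setting -> (banked ratio, LFB ratio) relative to the baseline."""
    base_banked, base_lfb = results[baseline]
    return {
        setting: (banked / base_banked, lfb / base_lfb)
        for setting, (banked, lfb) in results.items()
    }


for setting, (banked, lfb) in speedups(RESULTS).items():
    print(f"FASTVID {setting:<7} banked x{banked:4.1f}   LFB x{lfb:4.1f}")
```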
-
- --------------------------------------------------------------------------
-
- The following tests were run on an Intel Aurora motherboard with the B0
- stepping of the 82450, 64MB DRAM (all four SIMM sockets populated), and a
- 4MB Matrox MGA Millennium. The "000" setting simulates an A2 motherboard
- where Write Posting is disabled.
-
- --------------------------------------------------------------------------
- program:         fastvid setting:    000   100   011   111
- --------------------------------------------------------------------------
-
- VSPEED (LFB, million bytes/sec)        8    19    40    93
- Duke Nuke'm 3D (640x480, fps)         14    25    18    31
- Doom Benchmark (fps)                  38    70    48    74
- 640x480 FLC animation (fps)           25    48    88   121
- Chris's 3D benchmark (SVGA)           21    38    66    77
-
- Note that differences in motherboard and graphic card design may lead to
- different results. Most notably, some cards cannot sustain 93MB/sec in
- the VSPEED test. 82440 based motherboards usually perform a bit better in
- real world applications because of better DRAM and PCI access times.
-
- The above are all DOS applications. If you have an A2 motherboard turning
- on write posting will increase the WinBench96 Graphic Winmark score by
- about 25 percent. The write combining features don't make much difference
- to the Graphic Winmark score but there _are_ circumstances where write
- combining can make a big difference. One example is using the Media
- Player to play an animation to a high resolution, highcolor or truecolor
- window. For example:
-
- Run Win95 in a high resolution, direct color mode; say 1024x768, 24bits
- per pixel. Start the Media Player. Open \FUNSTUFF\VIDEOS\WEEZER.AVI from
- the Win95 CD-ROM. Enlarge the playback window to nearly full screen (do
- not use the Media Player's "full screen" option -- if you do it will
- change the screen to a lower resolution 8 bit mode for playback). Press
- the Play button. With write posting and write combining turned off you
- will get very poor results, about 2 frames per second. With write posting
- on and write combining off that will improve to about 4 frames per second.
- With both write posting and write combining on you will get very smooth
- playback with the frame rate too fast to count.
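-
- These frame rates are roughly what a back-of-the-envelope bandwidth
- estimate predicts. The Python sketch below assumes that every frame
- rewrites a full 1024x768, 24 bits-per-pixel screen and that the bus
- sustains the linear frame buffer rates from the VSPEED sample results; both
- are simplifications, so the numbers are upper bounds rather than
- predictions.

```python
def max_fps(width, height, bits_per_pixel, bytes_per_second):
    """Upper bound on frames per second if every frame rewrites the full area."""
    bytes_per_frame = width * height * bits_per_pixel // 8
    return bytes_per_second / bytes_per_frame


# Linear frame buffer throughput from the VSPEED sample results (bytes/sec).
LFB_RATES = {
    "posting off, combining off": 8.14e6,
    "posting on, combining off": 18.91e6,
    "posting on, combining on": 93.46e6,
}

for label, rate in LFB_RATES.items():
    print(f"{label:<28} at most {max_fps(1024, 768, 24, rate):5.1f} fps")
```

- Decoding and scaling the video take time as well, which would explain why
- the observed rates (about 2 and 4 frames per second) sit below these
- bounds.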
-
- You can see similar effects with the Hover! game on the Win95 CD-ROM.
- Again, with Win95 in a hires direct color mode, enlarge the game window as
- large as it will let you (about 640x480). With write posting and write
- combining off you will get poor performance. With write posting on the
- game will be playable. With write posting and write combining the action
- will be very smooth.
-
- If you have a pre-B0 motherboard you can still benefit from write
- combining (without fear of encountering the 82450 bugs) in the above DOS
- and Win95 situations. Just don't enable write posting.
-
- --------------------------------------------------------------------------
- further descriptions:
- --------------------------------------------------------------------------
-
- 1) Write Posting:
-
- Write Posting is where the processor "posts" data to the PCI bus and then
- goes on its way without waiting for the write operation to complete.
- Because of bugs in the pre-B0 stepping of the 82450 chipset Write Posting is
- disabled on early Pentium Pro motherboards. This severely limits the PCI
- throughput to about 8MB/sec. Most Pentium motherboards these days can get
- over 80MB/sec, 10 times faster. FASTVID can enable Write Posting on these
- motherboards, increasing PCI throughput to about 18MB/sec. You don't want to
- do this routinely because the bugs in the chipset will eventually cause the
- PCI bus to hang, forcing a reboot of the machine. Motherboards with the B0
- stepping have this bug fixed and Write Posting enabled by default.
-
-
- 2) Banked VGA Write Combining (BVGAWC):
-
- This function allows separate writes to the banked VGA mechanism to be
- combined into a cacheline that can be bursted out to video memory via the PCI
- bus. I believe this used to be handled in hardware but Intel decided to make
- it a programmable function with the Pentium Pro to make the motherboard
- architecture more general. If you enable BVGAWC with FASTVID PCI throughput
- will increase from 18MB/sec (B0 motherboard) to 90MB/sec for programs that
- use the banked VGA mechanism (most DOS games). If you enable only BVGAWC on
- an early motherboard (Write Posting remains off) the bus bandwidth increases
- from 8MB/sec to about 40MB/sec. Some of the newer motherboards (ASUS for
- instance) have this as a BIOS setup option.
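-
- For readers curious what "programmable" might look like in practice: on
- the Pentium Pro the memory type of the A0000-BFFFF window is described by a
- fixed-range MTRR (MSR 0x259, one byte per 16KB block). Whether FASTVID
- uses exactly this register is an assumption; this document does not say.
- The Python sketch below only computes the 64-bit value and never touches
- hardware.

```python
# Architectural constants (Intel fixed-range MTRR covering A0000-BFFFF).
MTRR_FIX16K_A0000 = 0x259
MTRR_TYPE_UC = 0x00  # uncacheable
MTRR_TYPE_WC = 0x01  # write combining
BLOCK_SIZE = 16 * 1024
WINDOW_BASE = 0xA0000


def fix16k_value(types):
    """Pack eight per-block memory types into one 64-bit MSR value.

    The lowest byte describes the lowest 16KB block.
    """
    if len(types) != 8:
        raise ValueError("exactly eight 16KB blocks cover A0000-BFFFF")
    value = 0
    for index, mem_type in enumerate(types):
        value |= (mem_type & 0xFF) << (8 * index)
    return value


def block_ranges():
    """Return the (start, end) address of each of the eight 16KB blocks."""
    return [
        (WINDOW_BASE + i * BLOCK_SIZE, WINDOW_BASE + (i + 1) * BLOCK_SIZE - 1)
        for i in range(8)
    ]


all_wc = fix16k_value([MTRR_TYPE_WC] * 8)
print(f"MSR {MTRR_FIX16K_A0000:#x} <- {all_wc:#018x}")
for start, end in block_ranges():
    print(f"  {start:05X}-{end:05X} write combining")
```

- Actually writing the value requires the privileged WRMSR instruction,
- which fits the restriction that FASTVID cannot run under a protected mode
- OS.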
-
-
- 3) Linear Frame Buffer Write Combining:
-
- Many newer graphics cards have their graphics memory mapped linearly at very
- high physical addresses (in addition to the banked VGA mechanism at A000:0000
- and B000:0000) beyond the 2GB mark. The reason for doing this is to make
- access to video memory simpler and faster -- programs (and Windows drivers)
- don't have to switch banks all the time to access all of video memory. I
- believe Pentium motherboards enable Write Combining for all high addresses
- but the Pentium Pro design requires the use of the processor's MSR registers
- to enable Write Combining. Again, this was done to generalize the
- motherboard design. You can theoretically have multiple devices mapped in
- high address space with different cacheability options. Intel believes that
- the proper place for this to be handled is within a PNP operating system.
- Unfortunately, no operating system yet supports this. As with BVGAWC, LFBWC
- will increase PCI throughput from 18MB/sec to 90MB/sec (or 8MB/sec to
- 45MB/sec with Write Posting off) for programs that use the linear frame
- buffer (some of the new hires DOS games, Windows drivers).
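-
- The MSRs in question are presumably the Pentium Pro's variable-range
- MTRRs; that is an assumption, since this document does not name them. As a
- sketch under that assumption, the Python below computes the base/mask
- register pair that would mark a 4MB frame buffer at FF000000 as write
- combining. It only does the arithmetic and does not write any registers;
- the function name is made up for the example.

```python
PHYS_ADDR_BITS = 36        # physical address width of the Pentium Pro
MTRR_TYPE_WC = 0x01        # write combining memory type
MTRR_PHYSBASE0 = 0x200     # first variable-range base MSR
MTRR_PHYSMASK0 = 0x201     # first variable-range mask MSR
MASK_VALID = 1 << 11       # "valid" bit in the mask register


def variable_mtrr(base, size_bytes, mem_type=MTRR_TYPE_WC):
    """Return (physbase, physmask) values for a power-of-two sized region."""
    if size_bytes < 4096 or size_bytes & (size_bytes - 1):
        raise ValueError("size must be a power of two of at least 4KB")
    if base % size_bytes:
        raise ValueError("base must be aligned to the region size")
    addr_mask = ((1 << PHYS_ADDR_BITS) - 1) & ~0xFFF
    physbase = (base & addr_mask) | mem_type
    physmask = (~(size_bytes - 1) & addr_mask) | MASK_VALID
    return physbase, physmask


MB = 1024 * 1024
base_value, mask_value = variable_mtrr(0xFF000000, 4 * MB)
print(f"MSR {MTRR_PHYSBASE0:#x} <- {base_value:#011x}")
print(f"MSR {MTRR_PHYSMASK0:#x} <- {mask_value:#011x}")
```

- The alignment check is a hardware requirement of this scheme: a region
- must be a power of two in size and aligned to that size.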
-
-
- Exactly how much a difference any of these functions makes depends on the
- applications being run and the graphics card you're using. If you are using
- a very slow graphics card you won't see much difference. Programs that do
- very little graphic output will show little or no difference. Programs that
- do lots of graphic output (realtime 3D games, multimedia animations) can show
- a large difference. There are even circumstances under Win95, OS/2 and NT
- where the difference can be huge.
-
- Here are some results with my Matrox MGA Millennium:
-
-                                       A2    B0   FASTVID
- ----------------------------------------------------------------
- Duke Nuke'm 3D (640x480, fps)         14    25        31
- Doom Benchmark (fps)                  38    70        74
- 640x480 FLC animation (fps)           25    48       121
- Chris's 3D benchmark (SVGA)           21    38        77
- Win95 Media Player* (fps)              2     5        15**
-
- * WEEZER.AVI from the Win95 CD-ROM enlarged to _nearly_ full screen at
- 1152x864, 32 bits-per-pixel.
-
- ** the frame rate was too fast to count, 15fps is an estimate -- the
- animation played fairly smoothly.
-
- --------------------------------------------------------------------------
-
- Notes for 82440 (Natoma chipset) based motherboards:
-
- Several FASTVID users have reported that one BIOS setting on these
- motherboards conflicts with FASTVID resulting in a system crash. If you
- experience these crashes try turning off "USWC write posting" (Uncached
- Speculative Write Combining) using the BIOS setup procedure.
-
- FASTVID's controls for write-posting don't seem to have any effect on
- 82440 motherboards. Presumably this means that write-posting is
- controlled by a different mechanism. I suggest using "FASTVID x11" so
- that FASTVID doesn't attempt to change the write-posting option if you
- have one of these motherboards.
-
- --------------------------------------------------------------------------
-